require("jsonlite")
require("RCurl")
require(extrafont)
require(ggplot2)
require(tidyr)
require(dplyr)

Procedure for getting our csv data onto Oracle.

1. Require packages.

2. Modify R_ETL.R: set working directory and file path, read our csv, use str() to determine which variables are continuous, and change measures accordingly.

3. Run the code to get the SQL CREATE TABLE command.

4. Use the command to create a table in SQL Developer to create a new table.

5. Open the table, click “Actions,” and “Import Data…”


6. Follow through the data import wizard (Match by Name)

7. All data from the csv file data should now be entered into the SQL table you created.

Data Frame (Olympics)

source("../01 Data/olympicsdataframe.R", echo = TRUE)
## 
## > require("jsonlite")
## 
## > require("RCurl")
## 
## > require(extrafont)
## 
## > require(ggplot2)
## 
## > olympics <- data.frame(fromJSON(getURL(URLencode(gsub("\n", 
## +     " ", "skipper.cs.utexas.edu:5001/rest/native/?query=\n\"select * \nfrom OLYMPICS; .... [TRUNCATED] 
## 
## > head(olympics)
##   HOST_CITY      SPORT  DISCIPLINE           ATHLETE NOC GENDER
## 1    Athens Gymnastics Artistic G.    HOFMANN, Fritz GER    Men
## 2    Athens Gymnastics Artistic G. MANTEUFFEL, Fritz GER    Men
## 3    Athens Gymnastics Artistic G.    NEUKIRCH, Karl GER    Men
## 4    Athens Gymnastics Artistic G.    RSTEL, Richard GER    Men
## 5    Athens Gymnastics Artistic G.    SCHUFT, Gustav GER    Men
## 6    Athens Gymnastics Artistic G.    SCHUMANN, Carl GER    Men
##                 EVENT MEDAL COUNTRY   YR TTL_G_YR_W TTL_G_YR_M TOTAL_G_YR
## 1 team, parallel bars  Gold Germany 1896          0         64         64
## 2 team, parallel bars  Gold Germany 1896          0         64         64
## 3 team, parallel bars  Gold Germany 1896          0         64         64
## 4 team, parallel bars  Gold Germany 1896          0         64         64
## 5 team, parallel bars  Gold Germany 1896          0         64         64
## 6 team, parallel bars  Gold Germany 1896          0         64         64
##   TTL_S_YR_W TTL_S_YR_M TOTAL_S_YR TTL_B_YR_W TTL_B_YR_M TTK_B_YR
## 1          0         47         47          0         40       40
## 2          0         47         47          0         40       40
## 3          0         47         47          0         40       40
## 4          0         47         47          0         40       40
## 5          0         47         47          0         40       40
## 6          0         47         47          0         40       40
##   TTL_MEDALS_YR TTL_MEDALS_YR_M TTL_MEDALS_W TTL_G_COUNTRY_OVERALL
## 1           151             151            0                   407
## 2           151             151            0                   407
## 3           151             151            0                   407
## 4           151             151            0                   407
## 5           151             151            0                   407
## 6           151             151            0                   407
##   TTL_S_COUNTRY_OVERALL TTL_B_COUNTRY_OVERALL TTL_MEDALS_COUNTRY_OVERALL
## 1                   350                   454                       1211
## 2                   350                   454                       1211
## 3                   350                   454                       1211
## 4                   350                   454                       1211
## 5                   350                   454                       1211
## 6                   350                   454                       1211
## 
## > summary(olympics)
##        HOST_CITY            SPORT             DISCIPLINE   
##  Athens     : 2149   Aquatics  : 3828   Athletics  : 3448  
##  Los Angeles: 2074   Athletics : 3448   Rowing     : 2523  
##  Beijing    : 2042   Rowing    : 2523   Swimming   : 2439  
##  Sydney     : 2015   Gymnastics: 2214   Artistic G.: 2103  
##  Atlanta    : 1859   Fencing   : 1547   Fencing    : 1547  
##  Barcelona  : 1705   Football  : 1387   Football   : 1387  
##  (Other)    :17372   (Other)   :14269   (Other)    :15769  
##                  ATHLETE           NOC          GENDER     
##  LATYNINA, Larisa    :   18   USA    : 4335   Men  :21721  
##  PHELPS, Michael     :   16   URS    : 2049   Women: 7495  
##  ANDRIANOV, Nikolay  :   15   GBR    : 1594                
##  MANGIAROTTI, Edoardo:   13   FRA    : 1314                
##  ONO, Takashi        :   13   ITA    : 1228                
##  SHAKHLIN, Boris     :   13   GER    : 1211                
##  (Other)             :29128   (Other):17485                
##                       EVENT          MEDAL                COUNTRY     
##  football                : 1387   Bronze:9689   United States : 4335  
##  hockey                  : 1325   Gold  :9850   Soviet Union  : 2049  
##  team competition        : 1018   Silver:9677   United Kingdom: 1594  
##  basketball              :  940                 France        : 1314  
##  eight with coxswain (8+):  918                 Italy         : 1228  
##  handball                :  886                 Germany       : 1211  
##  (Other)                 :22742                 (Other)       :17485  
##        YR         TTL_G_YR_W      TTL_G_YR_M      TOTAL_G_YR   
##  Min.   :1896   Min.   :  0.0   Min.   : 64.0   Min.   : 64.0  
##  1st Qu.:1948   1st Qu.: 32.0   1st Qu.:270.0   1st Qu.:301.0  
##  Median :1976   Median :127.0   Median :317.0   Median :460.0  
##  Mean   :1968   Mean   :136.8   Mean   :312.6   Mean   :449.4  
##  3rd Qu.:1996   3rd Qu.:259.0   3rd Qu.:361.0   3rd Qu.:615.0  
##  Max.   :2008   Max.   :307.0   Max.   :482.0   Max.   :669.0  
##                                                                
##    TTL_S_YR_W      TTL_S_YR_M      TOTAL_S_YR      TTL_B_YR_W   
##  Min.   :  0.0   Min.   : 47.0   Min.   : 47.0   Min.   :  0.0  
##  1st Qu.: 32.0   1st Qu.:268.0   1st Qu.:294.0   1st Qu.: 32.0  
##  Median :128.0   Median :313.0   Median :446.0   Median :126.0  
##  Mean   :137.3   Mean   :306.3   Mean   :443.5   Mean   :138.9  
##  3rd Qu.:258.0   3rd Qu.:357.0   3rd Qu.:610.0   3rd Qu.:260.0  
##  Max.   :306.0   Max.   :432.0   Max.   :667.0   Max.   :319.0  
##                                                                 
##    TTL_B_YR_M       TTK_B_YR     TTL_MEDALS_YR  TTL_MEDALS_YR_M 
##  Min.   : 40.0   Min.   : 40.0   Min.   : 151   Min.   : 151.0  
##  1st Qu.:254.0   1st Qu.:297.0   1st Qu.: 885   1st Qu.: 791.0  
##  Median :336.0   Median :447.0   Median :1305   Median : 958.0  
##  Mean   :312.7   Mean   :451.6   Mean   :1345   Mean   : 931.5  
##  3rd Qu.:376.0   3rd Qu.:634.0   3rd Qu.:1859   3rd Qu.:1099.0  
##  Max.   :391.0   Max.   :710.0   Max.   :2042   Max.   :1255.0  
##                                                                 
##   TTL_MEDALS_W TTL_G_COUNTRY_OVERALL TTL_S_COUNTRY_OVERALL
##  Min.   :  0   Min.   :   0.0        Min.   :   0.0       
##  1st Qu.: 96   1st Qu.: 143.0        1st Qu.: 174.0       
##  Median :381   Median : 329.0        Median : 308.0       
##  Mean   :413   Mean   : 551.2        Mean   : 418.9       
##  3rd Qu.:777   3rd Qu.: 498.0        3rd Qu.: 591.0       
##  Max.   :932   Max.   :2088.0        Max.   :1195.0       
##                                                           
##  TTL_B_COUNTRY_OVERALL TTL_MEDALS_COUNTRY_OVERALL
##  Min.   :   0.0        Min.   :   1              
##  1st Qu.: 152.0        1st Qu.: 466              
##  Median : 325.0        Median :1021              
##  Mean   : 396.8        Mean   :1367              
##  3rd Qu.: 505.0        3rd Qu.:1594              
##  Max.   :1052.0        Max.   :4335              
## 

Visualizations

Barchart

Procedure:
* Columns : ATTR(Total Metals Country Overall)
* Rows : Country
* Marks : Text -> ATTR(Total Metals Country Overall)
* Reference Line : Average of 184

Analysis:
This barchart shows the total number of medals a country has won throughout all the games each country has participated in, and a reference line that shows the average number of medals.

Boxplot

Procedure:
* Columns : Country
* Rows : Year
* Filters : Counter -> East Germany, Germany, Russia, Soviet Union, West Germany

Analysis:
This boxplot shows the years during which a country participated in the Olympics. This graph accurately shows the period Germany broke up into East and West Germany as well as the year in which the Soviet Union became Russia.

Histogram

Procedure:
* Columns : Year (bin)
* Rows : CNT(Year)
* Filters : Year -> 1896-2008
* Marks : Color -> Metal : Bronze, Silver, Gold
Text -> Host City

Analysis:
This histogram shows the number of medals awarded at each edition of the Olympics. We can see an increase in the medals given out, meaning that the games expanded to include a larger variety of sports. The 1916 Olympics, scheduled to be held in Berlin, Germany; were cancelled due to WW1. The 1940 and 1944 games were cancelled due to WW2.

Scatterplot

Procedure:
* (2 green things - measures)
* Columns : Total Metals Per Year Men
* Rows : Total Metals Per Year Women
* Marks : Color -> Year : 1896-2008
* Add a trend Line

Analysis:
The scatterplot shows how the ratio of medals won by women to the medals won by in a given year progressed throughout the history of the Olympic Games. This plot shows that women did not participate during the first Olympic games at Athens. The ratio started out low when Women began to compete, and slowly increased. This tells us that the Olypics began to incorporate more sports for women to compete in.

Crosstab map with KPI

Procedure:
* Columns : Longitude
* Rows : Latitude
* Filters : Country -> Excluded US and Soviet Union (outliers)
* Marks : Color -> AGG(Country Metal KPI) : Red to Green
* Filled Map
* KPI determined by SUM(Total Metals Country Overall) / 184. 184 is the average amount of number of metals won per country.

Analysis:
This map shows the total number of medals won by each country throughout the history of the Olympics, relative to all other participating countries.

*The US and Soviet Union are excluded as they had won far more than the other countries.

Shiny App:

https://ronly.shinyapps.io/DV_FinalProject